Biomedical Semantic Indexing using Dense Word Vectors in BioASQ
Abstract
Background: Biomedical curators are often required to semantically index large numbers of biomedical articles using hierarchically related labels (e.g., MeSH headings). Large-scale hierarchical classification, a branch of machine learning, can facilitate this procedure, but the resulting automatic classifiers are often inefficient because of the very high dimensionality of the dominant bag-of-words representation of texts. Feature selection quickly harms the accuracy of the classifiers in this particular task, and dimensionality reduction transformations (e.g., PCA-based) usually cannot be applied efficiently to very large corpora.
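The dimensionality contrast described in the abstract can be illustrated with a minimal sketch. It assumes a toy vocabulary and toy two-dimensional word vectors, both hypothetical, and represents a document as the average (centroid) of its word vectors. Averaging is only one common way to build a dense document vector; the excerpt above does not confirm that it is the method used in the paper.

```python
from collections import Counter

def bag_of_words(tokens, vocabulary):
    """Count vector with one dimension per vocabulary term."""
    counts = Counter(tokens)
    return [counts.get(term, 0) for term in vocabulary]

def centroid(tokens, word_vectors):
    """Average the dense vectors of the tokens that have one.

    Assumption: simple averaging, chosen here for illustration only.
    Returns an empty list if no token has a vector.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Hypothetical toy data: a real vocabulary has hundreds of thousands of
# terms, while dense word vectors typically have a few hundred dimensions.
vocabulary = ["protein", "gene", "cell", "tumor", "therapy", "kinase"]
word_vectors = {
    "protein": [1.0, 0.0],
    "gene": [0.0, 1.0],
    "kinase": [1.0, 1.0],
}

tokens = ["protein", "kinase", "protein"]
bow = bag_of_words(tokens, vocabulary)   # length grows with the vocabulary
dense = centroid(tokens, word_vectors)   # length fixed by the vector size
```

In this sketch the bag-of-words vector has as many dimensions as the vocabulary, whereas the dense vector keeps the dimensionality of the word vectors regardless of vocabulary size.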
Related articles
BioASQ: A Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering
This article provides an overview of BIOASQ, a new competition on biomedical semantic indexing and question answering (QA). BIOASQ aims to push towards systems that will allow biomedical workers to express their information needs in natural language and that will return concise and user-understandable answers by combining information from multiple sources of different kinds, including biomedica...
Large-Scale Semantic Indexing of Biomedical Publications
Automated annotation of scientific publications in real-world digital libraries requires dealing with challenges such as large number of concepts and training examples, multi-label training examples and hierarchical structure of concepts. BioASQ is a European project that contributes a large-scale biomedical publications corpus for working on these challenges. This paper documents the participa...
Large-scale online semantic indexing of biomedical articles via an ensemble of multi-label classification models
BACKGROUND In this paper we present the approach that we employed to deal with large scale multi-label semantic indexing of biomedical papers. This work was mainly implemented within the context of the BioASQ challenge (2013-2017), a challenge concerned with biomedical semantic indexing and question answering. METHODS Our main contribution is a MUlti-Label Ensemble method (MULE) that incorpor...
Classification and Retrieval of Biomedical Literatures: SNUMedinfo at CLEF QA track BioASQ 2014
This paper describes the participation of the SNUMedinfo team at the BioASQ Task 2a and Task 2b of CLEF 2014 Question Answering track. Task 2a was about biomedical semantic indexing. We trained SVM classifiers to automatically assign relevant MeSH descriptors to the MEDLINE article. Regarding Task 2b biomedical question answering, we participated at the document retrieval subtask in Phase A and...
IIITH at BioASQ Challenge 2015 Task 3a: Extreme Classification of PubMed Articles using MeSH Labels
Automating the process of indexing journal abstracts has been a topic of research for several years. Biomedical Semantic Indexing aims to assign correct MeSH terms to the PubMed documents. In this paper we report our participation in the Task 3a of BioASQ challenge 2015. The participating teams were provided with PubMed articles and asked to return relevant MeSH terms. We tried three different ...
Publication date: 2015